在实际应用中,尽管这种知识对于确定反应性控制系统与环境的精确相互作用很重要,但我们很少可以完全观察到系统的环境。因此,我们提出了一种在部分可观察到的环境中进行加固学习方法(RL)。在假设环境的行为就像是可观察到的马尔可夫决策过程,但我们对其结构或过渡概率不了解。我们的方法将Q学习与IOALERGIA结合在一起,这是一种学习马尔可夫决策过程(MDP)的方法。通过从RL代理的发作中学习环境的MDP模型,我们可以在不明确的部分可观察到的域中启用RL,而没有明确的记忆,以跟踪以前的相互作用,以处理由部分可观察性引起的歧义。相反,我们通过模拟学习环境模型上的新体验以跟踪探索状态,以抽象环境状态的形式提供其他观察结果。在我们的评估中,我们报告了方法的有效性及其有希望的性能,与六种具有复发性神经网络和固定记忆的最先进的深度RL技术相比。
translated by 谷歌翻译
In this work, we demonstrate the offline FPGA realization of both recurrent and feedforward neural network (NN)-based equalizers for nonlinearity compensation in coherent optical transmission systems. First, we present a realization pipeline showing the conversion of the models from Python libraries to the FPGA chip synthesis and implementation. Then, we review the main alternatives for the hardware implementation of nonlinear activation functions. The main results are divided into three parts: a performance comparison, an analysis of how activation functions are implemented, and a report on the complexity of the hardware. The performance in Q-factor is presented for the cases of bidirectional long-short-term memory coupled with convolutional NN (biLSTM + CNN) equalizer, CNN equalizer, and standard 1-StpS digital back-propagation (DBP) for the simulation and experiment propagation of a single channel dual-polarization (SC-DP) 16QAM at 34 GBd along 17x70km of LEAF. The biLSTM+CNN equalizer provides a similar result to DBP and a 1.7 dB Q-factor gain compared with the chromatic dispersion compensation baseline in the experimental dataset. After that, we assess the Q-factor and the impact of hardware utilization when approximating the activation functions of NN using Taylor series, piecewise linear, and look-up table (LUT) approximations. We also show how to mitigate the approximation errors with extra training and provide some insights into possible gradient problems in the LUT approximation. Finally, to evaluate the complexity of hardware implementation to achieve 400G throughput, fixed-point NN-based equalizers with approximated activation functions are developed and implemented in an FPGA.
translated by 谷歌翻译
To circumvent the non-parallelizability of recurrent neural network-based equalizers, we propose knowledge distillation to recast the RNN into a parallelizable feedforward structure. The latter shows 38\% latency decrease, while impacting the Q-factor by only 0.5dB.
translated by 谷歌翻译
在本文中,提出了一种新的方法,该方法允许基于神经网络(NN)均衡器的低复杂性发展,以缓解高速相干光学传输系统中的损伤。在这项工作中,我们提供了已应用于馈电和经常性NN设计的各种深层模型压缩方法的全面描述和比较。此外,我们评估了这些策略对每个NN均衡器的性能的影响。考虑量化,重量聚类,修剪和其他用于模型压缩的尖端策略。在这项工作中,我们提出并评估贝叶斯优化辅助压缩,其中选择了压缩的超参数以同时降低复杂性并提高性能。总之,通过使用模拟和实验数据来评估每种压缩方法的复杂性及其性能之间的权衡,以完成分析。通过利用最佳压缩方法,我们表明可以设计基于NN的均衡器,该均衡器比传统的数字背部传播(DBP)均衡器具有更好的性能,并且只有一个步骤。这是通过减少使用加权聚类和修剪算法后在NN均衡器中使用的乘数数量来完成的。此外,我们证明了基于NN的均衡器也可以实现卓越的性能,同时仍然保持与完整的电子色色散补偿块相同的复杂性。我们通过强调开放问题和现有挑战以及未来的研究方向来结束分析。
translated by 谷歌翻译
FPGA中首次实施了针对非线性补偿的经常性和前馈神经网络均衡器,其复杂度与分散均衡器的复杂度相当。我们证明,基于NN的均衡器可以胜过1个速度的DBP。
translated by 谷歌翻译
生成对抗性网络(GANS)的最新进展导致了面部图像合成的显着成果。虽然使用基于样式的GAN的方法可以产生尖锐的照片拟真的面部图像,但是通常难以以有意义和解开的方式控制所产生的面的特性。之前的方法旨在在先前培训的GaN的潜在空间内实现此类语义控制和解剖。相比之下,我们提出了一个框架,即明确地提出了诸如3D形状,反玻璃,姿势和照明的面部的身体属性,从而通过设计提供解剖。我们的方法,大多数GaN,与非线性3D可变模型的物理解剖和灵活性集成了基于风格的GAN的表现力和质感,我们与最先进的2D头发操纵网络相结合。大多数GaN通过完全解散的3D控制来实现肖像图像的照片拟理性操纵,从而实现了光线,面部表情和姿势变化的极端操作,直到完整的档案视图。
translated by 谷歌翻译
灵巧的操纵仍然是机器人技术中的一个空缺问题。为了协调研究界为解决这个问题的努力,我们提出了共同的基准。我们设计和构建了机器人平台,该平台托管在MPI上供智能系统托管,可以远程访问。每个平台由三个能够敏捷物体操纵的机器人手指组成。用户能够通过提交自动执行的代码(类似于计算群集)来远程控制平台。使用此设置,i)我们举办机器人竞赛,来自世界任何地方的团队访问我们的平台以应对具有挑战性的任务ii)我们发布了在这些比赛中收集的数据集(包括数百个机器人小时),而我们为研究人员提供了访问自己项目的这些平台。
translated by 谷歌翻译
Designing experiments often requires balancing between learning about the true treatment effects and earning from allocating more samples to the superior treatment. While optimal algorithms for the Multi-Armed Bandit Problem (MABP) provide allocation policies that optimally balance learning and earning, they tend to be computationally expensive. The Gittins Index (GI) is a solution to the MABP that can simultaneously attain optimality and computationally efficiency goals, and it has been recently used in experiments with Bernoulli and Gaussian rewards. For the first time, we present a modification of the GI rule that can be used in experiments with exponentially-distributed rewards. We report its performance in simulated 2- armed and 3-armed experiments. Compared to traditional non-adaptive designs, our novel GI modified design shows operating characteristics comparable in learning (e.g. statistical power) but substantially better in earning (e.g. direct benefits). This illustrates the potential that designs using a GI approach to allocate participants have to improve participant benefits, increase efficiencies, and reduce experimental costs in adaptive multi-armed experiments with exponential rewards.
translated by 谷歌翻译
Modelling and forecasting real-life human behaviour using online social media is an active endeavour of interest in politics, government, academia, and industry. Since its creation in 2006, Twitter has been proposed as a potential laboratory that could be used to gauge and predict social behaviour. During the last decade, the user base of Twitter has been growing and becoming more representative of the general population. Here we analyse this user base in the context of the 2021 Mexican Legislative Election. To do so, we use a dataset of 15 million election-related tweets in the six months preceding election day. We explore different election models that assign political preference to either the ruling parties or the opposition. We find that models using data with geographical attributes determine the results of the election with better precision and accuracy than conventional polling methods. These results demonstrate that analysis of public online data can outperform conventional polling methods, and that political analysis and general forecasting would likely benefit from incorporating such data in the immediate future. Moreover, the same Twitter dataset with geographical attributes is positively correlated with results from official census data on population and internet usage in Mexico. These findings suggest that we have reached a period in time when online activity, appropriately curated, can provide an accurate representation of offline behaviour.
translated by 谷歌翻译
Existing federated classification algorithms typically assume the local annotations at every client cover the same set of classes. In this paper, we aim to lift such an assumption and focus on a more general yet practical non-IID setting where every client can work on non-identical and even disjoint sets of classes (i.e., client-exclusive classes), and the clients have a common goal which is to build a global classification model to identify the union of these classes. Such heterogeneity in client class sets poses a new challenge: how to ensure different clients are operating in the same latent space so as to avoid the drift after aggregation? We observe that the classes can be described in natural languages (i.e., class names) and these names are typically safe to share with all parties. Thus, we formulate the classification problem as a matching process between data representations and class representations and break the classification model into a data encoder and a label encoder. We leverage the natural-language class names as the common ground to anchor the class representations in the label encoder. In each iteration, the label encoder updates the class representations and regulates the data representations through matching. We further use the updated class representations at each round to annotate data samples for locally-unaware classes according to similarity and distill knowledge to local models. Extensive experiments on four real-world datasets show that the proposed method can outperform various classical and state-of-the-art federated learning methods designed for learning with non-IID data.
translated by 谷歌翻译